Dataset statistics
| Statistic | Value |
|---|---|
| Number of variables | 18 |
| Number of observations | 2651861 |
| Missing cells | 2152547 |
| Missing cells (%) | 4.5% |
| Duplicate rows | 2452 |
| Duplicate rows (%) | 0.1% |
| Total size in memory | 364.2 MiB |
| Average record size in memory | 144.0 B |
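The percentages and the average record size in this table follow from the raw counts. A minimal sketch of that arithmetic in plain Python; the variable names are illustrative, and it assumes the missing-cell percentage is taken over all cells (observations times variables):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 18
n_observations = 2_651_861
missing_cells = 2_152_547
duplicate_rows = 2_452
total_size_mib = 364.2

# Assumption: the missing-cell share is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Share of rows flagged as duplicates.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average bytes per record: 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The three results round to the 4.5%, 0.1% and 144.0 B shown in the table.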
Variable types
| Type | Count |
|---|---|
| Numeric | 10 |
| Categorical | 8 |
Warnings
| Warning | Type |
|---|---|
| Dataset has 2452 (0.1%) duplicate rows | Duplicates |
| Start_Time has a high cardinality: 2582957 distinct values | High cardinality |
| End_Time has a high cardinality: 2578441 distinct values | High cardinality |
| Weather_Condition has a high cardinality: 121 distinct values | High cardinality |
| Temperature(F) has 45294 (1.7%) missing values | Missing |
| Humidity(%) has 48306 (1.8%) missing values | Missing |
| Pressure(in) has 38857 (1.5%) missing values | Missing |
| Visibility(mi) has 53002 (2.0%) missing values | Missing |
| Wind_Direction has 40241 (1.5%) missing values | Missing |
| Wind_Speed(mph) has 344548 (13.0%) missing values | Missing |
| Precipitation(in) has 1529311 (57.7%) missing values | Missing |
| Weather_Condition has 52931 (2.0%) missing values | Missing |
| Distance(mi) is highly skewed (γ1 = 37.62493565) | Skewed |
| Precipitation(in) is highly skewed (γ1 = 49.31946349) | Skewed |
| Start_Time is uniformly distributed | Uniform |
| End_Time is uniformly distributed | Uniform |
| Distance(mi) has 2215378 (83.5%) zeros | Zeros |
| Wind_Speed(mph) has 164999 (6.2%) zeros | Zeros |
| Precipitation(in) has 937421 (35.3%) zeros | Zeros |
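The missing-value percentages listed above can be reproduced from the counts and the 2651861 observations. The following is a sketch in plain Python; the helper name `missing_share` is hypothetical and the dictionary simply restates the counts from the list.

```python
n_observations = 2_651_861

# Missing-value counts copied from the list above.
missing_counts = {
    "Temperature(F)": 45_294,
    "Humidity(%)": 48_306,
    "Pressure(in)": 38_857,
    "Visibility(mi)": 53_002,
    "Wind_Direction": 40_241,
    "Wind_Speed(mph)": 344_548,
    "Precipitation(in)": 1_529_311,
    "Weather_Condition": 52_931,
}


def missing_share(count, total=n_observations):
    """Return the percentage of rows in which a column is missing."""
    return 100 * count / total


# Rank the columns from most to least affected.
ranked = sorted(missing_counts.items(), key=lambda item: item[1], reverse=True)
for column, count in ranked:
    print(f"{column:<20} {count:>9} {missing_share(count):5.1f}%")
```

Ranking the columns this way makes it easy to see that Precipitation(in) and Wind_Speed(mph) account for most of the missing cells.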
Reproduction
| Field | Value |
|---|---|
| Analysis started | 2021-05-02 17:00:37.590588 |
| Analysis finished | 2021-05-02 17:08:42.934028 |
| Duration | 8 minutes and 5.34 seconds |
| Software version | pandas-profiling v2.11.0 |
| Download configuration | config.yaml |
TMC
Real number (ℝ≥0)
| Distinct | 21 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 208.2827052 |
|---|---|
| Minimum | 200 |
| Maximum | 406 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 20.2 MiB |
Quantile statistics
| Minimum | 200 |
|---|---|
| 5-th percentile | 201 |
| Q1 | 201 |
| median | 201 |
| Q3 | 201 |
| 95-th percentile | 241 |
| Maximum | 406 |
| Range | 206 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 21.24599285 |
|---|---|
| Coefficient of variation (CV) | 0.1020055546 |
| Kurtosis | 39.42717498 |
| Mean | 208.2827052 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 5.26517742 |
| Sum | 552336783 |
| Variance | 451.3922124 |
| Monotonicity | Not monotonic |
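Several of the TMC figures are derived from others in the same tables. A short sketch of those relationships, using the usual definitions (coefficient of variation as standard deviation over mean, variance as squared standard deviation); that the report uses these definitions is an assumption, though the numbers agree:

```python
# Figures copied from the TMC quantile and descriptive statistics tables.
mean = 208.2827052
std = 21.24599285
minimum, maximum = 200, 406
q1, q3 = 201, 201

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV = {cv:.10f}, variance = {variance:.7f}")
print(f"range = {value_range}, IQR = {iqr}")
```

An IQR of 0 alongside a range of 206 reflects how concentrated the column is on a single code.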
| Value | Count | Frequency (%) |
| 201 | 2215577 | 83.5% |
| 241 | 272196 | 10.3% |
| 245 | 50124 | 1.9% |
| 229 | 22863 | 0.9% |
| 203 | 18010 | 0.7% |
| 222 | 13323 | 0.5% |
| 244 | 12982 | 0.5% |
| 406 | 12652 | 0.5% |
| 246 | 8751 | 0.3% |
| 343 | 7890 | 0.3% |
| Other values (11) | 17493 | 0.7% |
| Value | Count | Frequency (%) |
| 200 | 66 | < 0.1% |
| 201 | 2215577 | 83.5% |
| 202 | 6417 | 0.2% |
| 203 | 18010 | 0.7% |
| 206 | 1365 | 0.1% |
| Value | Count | Frequency (%) |
| 406 | 12652 | 0.5% |
| 351 | 6 | < 0.1% |
| 343 | 7890 | 0.3% |
| 341 | 657 | < 0.1% |
| 339 | 1003 | < 0.1% |
Severity
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 20.2 MiB |
| Value | Count |
|---|---|
| 2 | 1754544 |
| 3 | 886933 |
| 4 | 9269 |
| 1 | 1115 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2651861 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3 |
|---|---|
| 2nd row | 2 |
| 3rd row | 2 |
| 4th row | 3 |
| 5th row | 2 |
| Value | Count | Frequency (%) |
| 2 | 1754544 | 66.2% |
| 3 | 886933 | 33.4% |
| 4 | 9269 | 0.3% |
| 1 | 1115 | < 0.1% |
| Value | Count | Frequency (%) |
| 2 | 1754544 | |
| 3 | 886933 | |
| 4 | 9269 | 0.3% |
| 1 | 1115 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1754544 | |
| 3 | 886933 | |
| 4 | 9269 | 0.3% |
| 1 | 1115 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2651861 |
Most frequent character per category
| Value | Count | Frequency (%) |
| 2 | 1754544 | |
| 3 | 886933 | |
| 4 | 9269 | 0.3% |
| 1 | 1115 | < 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2651861 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 2 | 1754544 | |
| 3 | 886933 | |
| 4 | 9269 | 0.3% |
| 1 | 1115 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2651861 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 2 | 1754544 | |
| 3 | 886933 | |
| 4 | 9269 | 0.3% |
| 1 | 1115 | < 0.1% |
Start_Time
Categorical
| Distinct | 2582957 |
|---|---|
| Distinct (%) | 97.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 20.2 MiB |
| Value | Count |
|---|---|
| 2018-11-25 01:22:49 | 53 |
| 2018-11-12 00:37:27 | 40 |
| 2018-12-18 07:11:45 | 37 |
| 2016-04-10 08:59:26 | 35 |
| 2017-09-09 09:03:14 | 23 |
| Other values (2582952) | 2651673 |
Length
| Max length | 19 |
|---|---|
| Median length | 19 |
| Mean length | 19 |
| Min length | 19 |
Characters and Unicode
| Total characters | 50385359 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 4 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2521389 |
|---|---|
| Unique (%) | 95.1% |
Sample
| 1st row | 2016-02-08 05:46:00 |
|---|---|
| 2nd row | 2016-02-08 06:07:59 |
| 3rd row | 2016-02-08 06:49:27 |
| 4th row | 2016-02-08 07:23:34 |
| 5th row | 2016-02-08 07:39:07 |
| Value | Count | Frequency (%) |
| 2018-11-25 01:22:49 | 53 | < 0.1% |
| 2018-11-12 00:37:27 | 40 | < 0.1% |
| 2018-12-18 07:11:45 | 37 | < 0.1% |
| 2016-04-10 08:59:26 | 35 | < 0.1% |
| 2017-09-09 09:03:14 | 23 | < 0.1% |
| 2017-09-06 15:52:36 | 22 | < 0.1% |
| 2019-12-17 06:32:11 | 22 | < 0.1% |
| 2016-06-12 10:07:37 | 22 | < 0.1% |
| 2016-05-21 08:30:42 | 21 | < 0.1% |
| 2016-05-22 07:37:28 | 21 | < 0.1% |
| Other values (2582947) | 2651565 |
| Value | Count | Frequency (%) |
| 2018-11-06 | 3400 | 0.1% |
| 2018-11-09 | 3340 | 0.1% |
| 2018-11-05 | 3156 | 0.1% |
| 2017-09-15 | 3154 | 0.1% |
| 2018-11-02 | 3089 | 0.1% |
| 2018-10-26 | 3074 | 0.1% |
| 2018-10-19 | 3066 | 0.1% |
| 2018-10-18 | 3036 | 0.1% |
| 2019-11-15 | 3035 | 0.1% |
| 2017-09-29 | 3016 | 0.1% |
| Other values (87589) | 5272356 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 8958109 | |
| 1 | 7689838 | |
| 2 | 6412535 | |
| - | 5303722 | |
| : | 5303722 | |
| (space) | 2651861 | 5.3% |
| 8 | 2185072 | 4.3% |
| 3 | 2145848 | 4.3% |
| 5 | 2101887 | 4.2% |
| 4 | 2048449 | 4.1% |
| Other values (3) | 5584316 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 37126054 | |
| Dash Punctuation | 5303722 | 10.5% |
| Other Punctuation | 5303722 | 10.5% |
| Space Separator | 2651861 | 5.3% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 8958109 | |
| 1 | 7689838 | |
| 2 | 6412535 | |
| 8 | 2185072 | 5.9% |
| 3 | 2145848 | 5.8% |
| 5 | 2101887 | 5.7% |
| 4 | 2048449 | 5.5% |
| 7 | 1998891 | 5.4% |
| 9 | 1966592 | 5.3% |
| 6 | 1618833 | 4.4% |
| Value | Count | Frequency (%) |
| - | 5303722 |
| Value | Count | Frequency (%) |
| (space) | 2651861 |
| Value | Count | Frequency (%) |
| : | 5303722 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 50385359 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 8958109 | |
| 1 | 7689838 | |
| 2 | 6412535 | |
| - | 5303722 | |
| : | 5303722 | |
| (space) | 2651861 | 5.3% |
| 8 | 2185072 | 4.3% |
| 3 | 2145848 | 4.3% |
| 5 | 2101887 | 4.2% |
| 4 | 2048449 | 4.1% |
| Other values (3) | 5584316 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 50385359 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 8958109 | |
| 1 | 7689838 | |
| 2 | 6412535 | |
| - | 5303722 | |
| : | 5303722 | |
| (space) | 2651861 | 5.3% |
| 8 | 2185072 | 4.3% |
| 3 | 2145848 | 4.3% |
| 5 | 2101887 | 4.2% |
| 4 | 2048449 | 4.1% |
| Other values (3) | 5584316 |
End_Time
Categorical
| Distinct | 2578441 |
|---|---|
| Distinct (%) | 97.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 20.2 MiB |
| Value | Count |
|---|---|
| 2018-11-25 02:51:02 | 46 |
| 2018-12-18 08:11:10 | 37 |
| 2016-10-14 19:50:00 | 24 |
| 2018-09-16 13:51:54 | 22 |
| 2017-09-07 05:22:04 | 21 |
| Other values (2578436) | 2651711 |
Length
| Max length | 19 |
|---|---|
| Median length | 19 |
| Mean length | 19 |
| Min length | 19 |
Characters and Unicode
| Total characters | 50385359 |
|---|---|
| Distinct characters | 13 |
| Distinct categories | 4 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2511388 |
|---|---|
| Unique (%) | 94.7% |
Sample
| 1st row | 2016-02-08 11:00:00 |
|---|---|
| 2nd row | 2016-02-08 06:37:59 |
| 3rd row | 2016-02-08 07:19:27 |
| 4th row | 2016-02-08 07:53:34 |
| 5th row | 2016-02-08 08:09:07 |
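Both timestamp columns are profiled as Categorical with a fixed length of 19 characters, which suggests they are stored as strings of the form `YYYY-MM-DD HH:MM:SS`. The sketch below assumes that format holds for every row and that the start and end samples pair up by row position; it parses the five sample rows with the standard library and derives a duration for each.

```python
from datetime import datetime

# Assumed layout of the timestamp strings (19 characters).
FORMAT = "%Y-%m-%d %H:%M:%S"

# Sample rows copied from the two "Sample" tables.
start_samples = [
    "2016-02-08 05:46:00",
    "2016-02-08 06:07:59",
    "2016-02-08 06:49:27",
    "2016-02-08 07:23:34",
    "2016-02-08 07:39:07",
]
end_samples = [
    "2016-02-08 11:00:00",
    "2016-02-08 06:37:59",
    "2016-02-08 07:19:27",
    "2016-02-08 07:53:34",
    "2016-02-08 08:09:07",
]


def duration_minutes(start, end):
    """Parse two timestamp strings and return the gap between them in minutes."""
    delta = datetime.strptime(end, FORMAT) - datetime.strptime(start, FORMAT)
    return delta.total_seconds() / 60


durations = [duration_minutes(s, e) for s, e in zip(start_samples, end_samples)]
print(durations)
```

Once parsed, the columns can be summarized as dates or durations rather than as high-cardinality strings.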
| Value | Count | Frequency (%) |
| 2018-11-25 02:51:02 | 46 | < 0.1% |
| 2018-12-18 08:11:10 | 37 | < 0.1% |
| 2016-10-14 19:50:00 | 24 | < 0.1% |
| 2018-09-16 13:51:54 | 22 | < 0.1% |
| 2017-09-07 05:22:04 | 21 | < 0.1% |
| 2018-03-28 09:35:05 | 21 | < 0.1% |
| 2020-04-10 20:21:52 | 21 | < 0.1% |
| 2016-10-14 16:30:00 | 18 | < 0.1% |
| 2016-10-14 18:30:00 | 16 | < 0.1% |
| 2019-04-05 07:52:58 | 16 | < 0.1% |
| Other values (2578431) | 2651619 |
| Value | Count | Frequency (%) |
| 2018-11-06 | 3401 | 0.1% |
| 2018-11-09 | 3350 | 0.1% |
| 2018-11-05 | 3145 | 0.1% |
| 2017-09-15 | 3132 | 0.1% |
| 2018-11-02 | 3082 | 0.1% |
| 2018-10-26 | 3079 | 0.1% |
| 2018-10-19 | 3059 | 0.1% |
| 2018-10-18 | 3042 | 0.1% |
| 2019-11-15 | 3039 | 0.1% |
| 2019-11-22 | 3028 | 0.1% |
| Other values (87546) | 5272365 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 9100940 | |
| 1 | 7751081 | |
| 2 | 6538682 | |
| - | 5303722 | |
| : | 5303722 | |
| (space) | 2651861 | 5.3% |
| 8 | 2206844 | 4.4% |
| 3 | 2113947 | 4.2% |
| 9 | 2079685 | 4.1% |
| 5 | 1994678 | 4.0% |
| Other values (3) | 5340197 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 37126054 | |
| Dash Punctuation | 5303722 | 10.5% |
| Other Punctuation | 5303722 | 10.5% |
| Space Separator | 2651861 | 5.3% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 9100940 | |
| 1 | 7751081 | |
| 2 | 6538682 | |
| 8 | 2206844 | 5.9% |
| 3 | 2113947 | 5.7% |
| 9 | 2079685 | 5.6% |
| 5 | 1994678 | 5.4% |
| 4 | 1949696 | 5.3% |
| 7 | 1887334 | 5.1% |
| 6 | 1503167 | 4.0% |
| Value | Count | Frequency (%) |
| - | 5303722 |
| Value | Count | Frequency (%) |
| (space) | 2651861 |
| Value | Count | Frequency (%) |
| : | 5303722 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 50385359 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 9100940 | |
| 1 | 7751081 | |
| 2 | 6538682 | |
| - | 5303722 | |
| : | 5303722 | |
| (space) | 2651861 | 5.3% |
| 8 | 2206844 | 4.4% |
| 3 | 2113947 | 4.2% |
| 9 | 2079685 | 4.1% |
| 5 | 1994678 | 4.0% |
| Other values (3) | 5340197 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 50385359 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 9100940 | |
| 1 | 7751081 | |
| 2 | 6538682 | |
| - | 5303722 | |
| : | 5303722 | |
| (space) | 2651861 | 5.3% |
| 8 | 2206844 | 4.4% |
| 3 | 2113947 | 4.2% |
| 9 | 2079685 | 4.1% |
| 5 | 1994678 | 4.0% |
| Other values (3) | 5340197 |
Start_Lat
Real number (ℝ≥0)
| Distinct | 827046 |
|---|---|
| Distinct (%) | 31.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 36.10331858 |
|---|---|
| Minimum | 24.555269 |
| Maximum | 49.002201 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 20.2 MiB |
Quantile statistics
| Minimum | 24.555269 |
|---|---|
| 5-th percentile | 28.304241 |
| Q1 | 33.148983 |
| median | 35.391338 |
| Q3 | 39.98476 |
| 95-th percentile | 43.181423 |
| Maximum | 49.002201 |
| Range | 24.446932 |
| Interquartile range (IQR) | 6.835777 |
Descriptive statistics
| Standard deviation | 4.839869987 |
|---|---|
| Coefficient of variation (CV) | 0.1340560972 |
| Kurtosis | -0.5485835472 |
| Mean | 36.10331858 |
| Median Absolute Deviation (MAD) | 3.571625 |
| Skewness | 0.08573729209 |
| Sum | 95740982.52 |
| Variance | 23.42434149 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 33.941364 | 539 | < 0.1% |
| 42.476501 | 534 | < 0.1% |
| 33.744976 | 530 | < 0.1% |
| 37.808498 | 505 | < 0.1% |
| 34.858925 | 493 | < 0.1% |
| 33.876289 | 434 | < 0.1% |
| 42.368423 | 433 | < 0.1% |
| 33.781532 | 432 | < 0.1% |
| 25.789072 | 429 | < 0.1% |
| 40.850067 | 415 | < 0.1% |
| Other values (827036) | 2647117 |
| Value | Count | Frequency (%) |
| 24.555269 | 1 | |
| 24.5574 | 1 | |
| 24.55987 | 1 | |
| 24.560246 | 1 | |
| 24.560688 | 1 |
| Value | Count | Frequency (%) |
| 49.002201 | 1 | < 0.1% |
| 49.000759 | 1 | < 0.1% |
| 48.999901 | 1 | < 0.1% |
| 48.999569 | 1 | < 0.1% |
| 48.998241 | 4 |
Start_Lng
Real number (ℝ)
| Distinct | 789293 |
|---|---|
| Distinct (%) | 29.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -93.3411788 |
|---|---|
| Minimum | -124.623833 |
| Maximum | -67.839745 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 20.2 MiB |
Quantile statistics
| Minimum | -124.623833 |
|---|---|
| 5-th percentile | -122.079926 |
| Q1 | -105.962898 |
| median | -87.462791 |
| Q3 | -80.821632 |
| 95-th percentile | -73.884247 |
| Maximum | -67.839745 |
| Range | 56.784088 |
| Interquartile range (IQR) | 25.141266 |
Descriptive statistics
| Standard deviation | 16.23687298 |
|---|---|
| Coefficient of variation (CV) | -0.173951874 |
| Kurtosis | -1.007185342 |
| Mean | -93.3411788 |
| Median Absolute Deviation (MAD) | 9.255227 |
| Skewness | -0.6380036073 |
| Sum | -247527831.7 |
| Variance | 263.636044 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -118.096634 | 535 | < 0.1% |
| -83.111794 | 534 | < 0.1% |
| -84.390343 | 529 | < 0.1% |
| -122.366852 | 511 | < 0.1% |
| -82.259857 | 494 | < 0.1% |
| -118.368263 | 472 | < 0.1% |
| -84.390869 | 455 | < 0.1% |
| -83.058128 | 434 | < 0.1% |
| -80.204353 | 430 | < 0.1% |
| -73.944817 | 419 | < 0.1% |
| Other values (789283) | 2647048 |
| Value | Count | Frequency (%) |
| -124.623833 | 1 | |
| -124.534439 | 1 | |
| -124.493149 | 1 | |
| -124.484421 | 1 | |
| -124.479179 | 1 |
| Value | Count | Frequency (%) |
| -67.839745 | 1 | |
| -67.841858 | 1 | |
| -68.060165 | 1 | |
| -68.14003 | 1 | |
| -68.380852 | 1 |
Distance(mi)
Real number (ℝ≥0)
| Distinct | 3875 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.1986762127 |
|---|---|
| Minimum | 0 |
| Maximum | 441.75 |
| Zeros | 2215378 |
| Zeros (%) | 83.5% |
| Memory size | 20.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0.6299999952 |
| Maximum | 441.75 |
| Range | 441.75 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.584473869 |
|---|---|
| Coefficient of variation (CV) | 7.9751564 |
| Kurtosis | 4880.659121 |
| Mean | 0.1986762127 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 37.62493565 |
| Sum | 526861.7 |
| Variance | 2.510557442 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 2215378 | 83.5% |
| 0.01 | 246986 | 9.3% |
| 0.009999999776 | 12808 | 0.5% |
| 0.01999999955 | 5885 | 0.2% |
| 0.6800000072 | 808 | < 0.1% |
| 0.4499999881 | 793 | < 0.1% |
| 0.4300000072 | 780 | < 0.1% |
| 0.4900000095 | 776 | < 0.1% |
| 0.3899999857 | 774 | < 0.1% |
| 0.3100000024 | 770 | < 0.1% |
| Other values (3865) | 166103 | 6.3% |
| Value | Count | Frequency (%) |
| 0 | 2215378 | 83.5% |
| 0.009999999776 | 12808 | 0.5% |
| 0.01 | 246986 | 9.3% |
| 0.01999999955 | 5885 | 0.2% |
| 0.02 | 9 | < 0.1% |
| Value | Count | Frequency (%) |
| 441.75 | 1 | |
| 333.6300049 | 1 | |
| 254.3999939 | 1 | |
| 251.2200012 | 1 | |
| 227.2100067 | 1 |
Side
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 20.2 MiB |
| Value | Count |
|---|---|
| R | 2115329 |
| L | 536531 |
| (space) | 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 2651861 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | R |
|---|---|
| 2nd row | L |
| 3rd row | R |
| 4th row | R |
| 5th row | R |
| Value | Count | Frequency (%) |
| R | 2115329 | 79.8% |
| L | 536531 | 20.2% |
| (space) | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| r | 2115329 | |
| l | 536531 | 20.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| R | 2115329 | 79.8% |
| L | 536531 | 20.2% |
| (space) | 1 | < 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2651860 | |
| Space Separator | 1 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| R | 2115329 | |
| L | 536531 | 20.2% |
| Value | Count | Frequency (%) |
| (space) | 1 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2651860 | |
| Common | 1 | < 0.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| R | 2115329 | |
| L | 536531 | 20.2% |
| Value | Count | Frequency (%) |
| (space) | 1 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2651861 |
Most frequent character per block
| Value | Count | Frequency (%) |
| R | 2115329 | 79.8% |
| L | 536531 | 20.2% |
| (space) | 1 | < 0.1% |
State
Categorical
| Distinct | 49 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 20.2 MiB |
| Value | Count |
|---|---|
| CA | 483157 |
| TX | 299660 |
| FL | 214170 |
| SC | 184379 |
| NC | 141834 |
| Other values (44) | 1328661 |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 2 |
| Min length | 2 |
Characters and Unicode
| Total characters | 5303722 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | OH |
|---|---|
| 2nd row | OH |
| 3rd row | OH |
| 4th row | OH |
| 5th row | OH |
| Value | Count | Frequency (%) |
| CA | 483157 | 18.2% |
| TX | 299660 | 11.3% |
| FL | 214170 | 8.1% |
| SC | 184379 | 7.0% |
| NC | 141834 | 5.3% |
| NY | 126453 | 4.8% |
| PA | 92719 | 3.5% |
| MI | 77183 | 2.9% |
| VA | 76361 | 2.9% |
| GA | 75310 | 2.8% |
| Other values (39) | 880635 |
| Value | Count | Frequency (%) |
| ca | 483157 | |
| tx | 299660 | 11.3% |
| fl | 214170 | 8.1% |
| sc | 184379 | 7.0% |
| nc | 141834 | 5.3% |
| ny | 126453 | 4.8% |
| pa | 92719 | 3.5% |
| mi | 77183 | 2.9% |
| va | 76361 | 2.9% |
| ga | 75310 | 2.8% |
| Other values (39) | 880635 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 997313 | |
| C | 862776 | |
| N | 490948 | |
| T | 405640 | |
| L | 391264 | 7.4% |
| X | 299660 | 5.6% |
| M | 234004 | 4.4% |
| F | 214170 | 4.0% |
| I | 203952 | 3.8% |
| S | 194102 | 3.7% |
| Other values (14) | 1009893 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 5303722 |
Most frequent character per category
| Value | Count | Frequency (%) |
| A | 997313 | |
| C | 862776 | |
| N | 490948 | |
| T | 405640 | |
| L | 391264 | 7.4% |
| X | 299660 | 5.6% |
| M | 234004 | 4.4% |
| F | 214170 | 4.0% |
| I | 203952 | 3.8% |
| S | 194102 | 3.7% |
| Other values (14) | 1009893 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5303722 |
Most frequent character per script
| Value | Count | Frequency (%) |
| A | 997313 | |
| C | 862776 | |
| N | 490948 | |
| T | 405640 | |
| L | 391264 | 7.4% |
| X | 299660 | 5.6% |
| M | 234004 | 4.4% |
| F | 214170 | 4.0% |
| I | 203952 | 3.8% |
| S | 194102 | 3.7% |
| Other values (14) | 1009893 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5303722 |
Most frequent character per block
| Value | Count | Frequency (%) |
| A | 997313 | |
| C | 862776 | |
| N | 490948 | |
| T | 405640 | |
| L | 391264 | 7.4% |
| X | 299660 | 5.6% |
| M | 234004 | 4.4% |
| F | 214170 | 4.0% |
| I | 203952 | 3.8% |
| S | 194102 | 3.7% |
| Other values (14) | 1009893 |
Temperature(F)
Real number (ℝ)
| Distinct | 826 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 45294 |
| Missing (%) | 1.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 62.55807182 |
|---|---|
| Minimum | -89 |
| Maximum | 203 |
| Zeros | 493 |
| Zeros (%) | < 0.1% |
| Memory size | 20.2 MiB |
Quantile statistics
| Minimum | -89 |
|---|---|
| 5-th percentile | 29 |
| Q1 | 50 |
| median | 64.9 |
| Q3 | 77 |
| 95-th percentile | 89.1 |
| Maximum | 203 |
| Range | 292 |
| Interquartile range (IQR) | 27 |
Descriptive statistics
| Standard deviation | 18.62876513 |
|---|---|
| Coefficient of variation (CV) | 0.2977835567 |
| Kurtosis | -0.01953030608 |
| Mean | 62.55807182 |
| Median Absolute Deviation (MAD) | 12.9 |
| Skewness | -0.5091188724 |
| Sum | 163061805.6 |
| Variance | 347.0308902 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 77 | 60762 | 2.3% |
| 68 | 59117 | 2.2% |
| 73 | 54275 | 2.0% |
| 59 | 52197 | 2.0% |
| 75 | 50638 | 1.9% |
| 72 | 50297 | 1.9% |
| 70 | 49209 | 1.9% |
| 63 | 46801 | 1.8% |
| 64 | 46359 | 1.7% |
| 79 | 45831 | 1.7% |
| Other values (816) | 2091081 | |
| (Missing) | 45294 | 1.7% |
| Value | Count | Frequency (%) |
| -89 | 7 | |
| -77.8 | 10 | |
| -33 | 1 | < 0.1% |
| -32.8 | 1 | < 0.1% |
| -29.9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 203 | 1 | |
| 189 | 1 | |
| 174 | 2 | |
| 167 | 1 | |
| 161.6 | 1 |
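The extreme Temperature(F) values listed above (down to -89 and up to 203) are far from the bulk of the distribution. One conventional way to flag such values is Tukey's 1.5 × IQR rule. The report itself does not apply this rule; the sketch below is only an illustration using the quartiles from the quantile table, and the rule would also flag legitimately cold readings below the lower fence.

```python
# Quartiles copied from the Temperature(F) quantile statistics.
q1, q3 = 50.0, 77.0
iqr = q3 - q1

# Tukey fences: an assumed, conventional choice, not taken from the report.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Smallest and largest values copied from the extreme-value tables.
extremes = [-89, -77.8, -33, -32.8, -29.9, 161.6, 167, 174, 189, 203]
flagged = [t for t in extremes if t < lower_fence or t > upper_fence]

print(f"fences: [{lower_fence}, {upper_fence}]")
print(f"flagged: {flagged}")
```

Whether flagged readings are dropped, capped, or kept is a modelling decision that depends on the downstream use.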
Humidity(%)
Real number (ℝ≥0)
| Distinct | 100 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 48306 |
| Missing (%) | 1.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 66.32201279 |
|---|---|
| Minimum | 1 |
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 20.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 26 |
| Q1 | 50 |
| median | 69 |
| Q3 | 86 |
| 95-th percentile | 97 |
| Maximum | 100 |
| Range | 99 |
| Interquartile range (IQR) | 36 |
Descriptive statistics
| Standard deviation | 22.37503884 |
|---|---|
| Coefficient of variation (CV) | 0.3373697194 |
| Kurtosis | -0.679863749 |
| Mean | 66.32201279 |
| Median Absolute Deviation (MAD) | 18 |
| Skewness | -0.4242431196 |
| Sum | 172673008 |
| Variance | 500.6423633 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 100 | 114775 | 4.3% |
| 93 | 104841 | 4.0% |
| 90 | 65221 | 2.5% |
| 87 | 63477 | 2.4% |
| 96 | 52004 | 2.0% |
| 94 | 49241 | 1.9% |
| 89 | 47940 | 1.8% |
| 84 | 47352 | 1.8% |
| 81 | 44877 | 1.7% |
| 82 | 43964 | 1.7% |
| Other values (90) | 1969863 | |
| (Missing) | 48306 | 1.8% |
| Value | Count | Frequency (%) |
| 1 | 2 | < 0.1% |
| 2 | 10 | < 0.1% |
| 3 | 48 | < 0.1% |
| 4 | 515 | |
| 5 | 1064 |
| Value | Count | Frequency (%) |
| 100 | 114775 | |
| 99 | 3939 | 0.1% |
| 98 | 2219 | 0.1% |
| 97 | 35367 | 1.3% |
| 96 | 52004 |
Pressure(in)
Real number (ℝ≥0)
| Distinct | 988 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 38857 |
| Missing (%) | 1.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 29.7739001 |
|---|---|
| Minimum | 0 |
| Maximum | 58.04 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 20.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 28.85 |
| Q1 | 29.73 |
| median | 29.95 |
| Q3 | 30.09 |
| 95-th percentile | 30.33 |
| Maximum | 58.04 |
| Range | 58.04 |
| Interquartile range (IQR) | 0.36 |
Descriptive statistics
| Standard deviation | 0.7501909159 |
|---|---|
| Coefficient of variation (CV) | 0.02519625959 |
| Kurtosis | 48.36963466 |
| Mean | 29.7739001 |
| Median Absolute Deviation (MAD) | 0.17 |
| Skewness | -5.191468097 |
| Sum | 77799320.06 |
| Variance | 0.5627864103 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 30.01 | 53925 | 2.0% |
| 29.96 | 52678 | 2.0% |
| 29.99 | 52632 | 2.0% |
| 30.04 | 51724 | 2.0% |
| 29.94 | 50188 | 1.9% |
| 30.06 | 49674 | 1.9% |
| 29.91 | 46084 | 1.7% |
| 30.03 | 45915 | 1.7% |
| 30 | 45057 | 1.7% |
| 30.09 | 45045 | 1.7% |
| Other values (978) | 2120082 |
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 0.12 | 1 | < 0.1% |
| 0.29 | 2 | |
| 0.3 | 4 | |
| 0.39 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 58.04 | 1 | |
| 33.04 | 2 | |
| 31.15 | 1 | |
| 31.14 | 1 | |
| 31.12 | 2 |
Visibility(mi)
Real number (ℝ≥0)
| Distinct | 77 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 53002 |
| Missing (%) | 2.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.100616274 |
|---|---|
| Minimum | 0 |
| Maximum | 140 |
| Zeros | 1133 |
| Zeros (%) | < 0.1% |
| Memory size | 20.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 10 |
| median | 10 |
| Q3 | 10 |
| 95-th percentile | 10 |
| Maximum | 140 |
| Range | 140 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 2.773215094 |
|---|---|
| Coefficient of variation (CV) | 0.3047282745 |
| Kurtosis | 76.02581646 |
| Mean | 9.100616274 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.924693413 |
| Sum | 23651218.51 |
| Variance | 7.690721956 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10 | 2064841 | |
| 7 | 82235 | 3.1% |
| 9 | 71282 | 2.7% |
| 8 | 56596 | 2.1% |
| 5 | 52816 | 2.0% |
| 6 | 45955 | 1.7% |
| 4 | 42294 | 1.6% |
| 3 | 41422 | 1.6% |
| 2 | 34538 | 1.3% |
| 1 | 23969 | 0.9% |
| Other values (67) | 82911 | 3.1% |
| (Missing) | 53002 | 2.0% |
| Value | Count | Frequency (%) |
| 0 | 1133 | |
| 0.06 | 114 | < 0.1% |
| 0.1 | 996 | |
| 0.12 | 362 | < 0.1% |
| 0.19 | 5 | < 0.1% |
| Value | Count | Frequency (%) |
| 140 | 1 | < 0.1% |
| 111 | 2 | < 0.1% |
| 105 | 1 | < 0.1% |
| 101 | 1 | < 0.1% |
| 100 | 7 |
Wind_Direction
Categorical
| Distinct | 24 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 40241 |
| Missing (%) | 1.5% |
| Memory size | 20.2 MiB |
| Value | Count |
|---|---|
| Calm | 285016 |
| CALM | 164998 |
| SSW | 136302 |
| South | 134871 |
| SW | 129211 |
| Other values (19) | 1761222 |
Length
| Max length | 8 |
|---|---|
| Median length | 3 |
| Mean length | 3.253866183 |
| Min length | 1 |
Characters and Unicode
| Total characters | 8497862 |
|---|---|
| Distinct characters | 22 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Calm |
|---|---|
| 2nd row | Calm |
| 3rd row | SW |
| 4th row | SW |
| 5th row | SW |
| Value | Count | Frequency (%) |
| Calm | 285016 | 10.7% |
| CALM | 164998 | 6.2% |
| SSW | 136302 | 5.1% |
| South | 134871 | 5.1% |
| SW | 129211 | 4.9% |
| SSE | 125424 | 4.7% |
| WNW | 123072 | 4.6% |
| West | 121874 | 4.6% |
| WSW | 119376 | 4.5% |
| NW | 118928 | 4.5% |
| Other values (14) | 1152548 |
| Value | Count | Frequency (%) |
| calm | 450014 | |
| ssw | 136302 | 5.2% |
| south | 134871 | 5.2% |
| sw | 129211 | 4.9% |
| sse | 125424 | 4.8% |
| wnw | 123072 | 4.7% |
| west | 121874 | 4.7% |
| wsw | 119376 | 4.6% |
| nw | 118928 | 4.6% |
| north | 116153 | 4.4% |
| Other values (13) | 1036395 |
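The two tables above show that `Calm` (285016) and `CALM` (164998) are counted as separate categories, while the lower-cased table reports a single `calm` with 450014 occurrences. A minimal sketch of merging labels that differ only by case, using the counts from the first rows of the table; it does not attempt to reconcile any other spelling variants.

```python
from collections import Counter

# Counts copied from the common-values table (first five rows only).
raw_counts = {
    "Calm": 285_016,
    "CALM": 164_998,
    "SSW": 136_302,
    "South": 134_871,
    "SW": 129_211,
}

# Merge labels that differ only by letter case.
merged = Counter()
for label, count in raw_counts.items():
    merged[label.casefold()] += count

for label, count in merged.most_common():
    print(f"{label:<6} {count}")
```

After merging, calm conditions become the largest single category in this excerpt.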
Most occurring characters
| Value | Count | Frequency (%) |
| S | 1175417 | |
| W | 1160207 | |
| N | 1008720 | |
| E | 894911 | |
| a | 542454 | 6.4% |
| t | 451214 | 5.3% |
| C | 450014 | 5.3% |
| l | 374577 | 4.4% |
| m | 285016 | 3.4% |
| o | 251024 | 3.0% |
| Other values (12) | 1904308 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 5411221 | |
| Lowercase Letter | 3086641 |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 542454 | |
| t | 451214 | |
| l | 374577 | |
| m | 285016 | |
| o | 251024 | |
| h | 251024 | |
| e | 211435 | 6.9% |
| r | 205714 | 6.7% |
| s | 200190 | 6.5% |
| u | 134871 | 4.4% |
| Other values (2) | 179122 | 5.8% |
| Value | Count | Frequency (%) |
| S | 1175417 | |
| W | 1160207 | |
| N | 1008720 | |
| E | 894911 | |
| C | 450014 | 8.3% |
| A | 210797 | 3.9% |
| L | 164998 | 3.0% |
| M | 164998 | 3.0% |
| V | 135360 | 2.5% |
| R | 45799 | 0.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 8497862 |
Most frequent character per script
| Value | Count | Frequency (%) |
| S | 1175417 | |
| W | 1160207 | |
| N | 1008720 | |
| E | 894911 | |
| a | 542454 | 6.4% |
| t | 451214 | 5.3% |
| C | 450014 | 5.3% |
| l | 374577 | 4.4% |
| m | 285016 | 3.4% |
| o | 251024 | 3.0% |
| Other values (12) | 1904308 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 8497862 |
Most frequent character per block
| Value | Count | Frequency (%) |
| S | 1175417 | |
| W | 1160207 | |
| N | 1008720 | |
| E | 894911 | |
| a | 542454 | 6.4% |
| t | 451214 | 5.3% |
| C | 450014 | 5.3% |
| l | 374577 | 4.4% |
| m | 285016 | 3.4% |
| o | 251024 | 3.0% |
| Other values (12) | 1904308 |
Wind_Speed(mph)
Real number (ℝ≥0)
| Distinct | 139 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 344548 |
| Missing (%) | 13.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.094265321 |
|---|---|
| Minimum | 0 |
| Maximum | 822.8 |
| Zeros | 164999 |
| Zeros (%) | 6.2% |
| Memory size | 20.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 4.6 |
| median | 7 |
| Q3 | 10.4 |
| 95-th percentile | 17 |
| Maximum | 822.8 |
| Range | 822.8 |
| Interquartile range (IQR) | 5.8 |
Descriptive statistics
| Standard deviation | 5.130874422 |
|---|---|
| Coefficient of variation (CV) | 0.6338900714 |
| Kurtosis | 1835.324876 |
| Mean | 8.094265321 |
| Median Absolute Deviation (MAD) | 2.4 |
| Skewness | 13.54744091 |
| Sum | 18676003.6 |
| Variance | 26.32587234 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4.6 | 168473 | 6.4% |
| 5.8 | 166509 | 6.3% |
| 0 | 164999 | 6.2% |
| 3.5 | 157352 | 5.9% |
| 6.9 | 154424 | 5.8% |
| 8.1 | 137879 | 5.2% |
| 9.2 | 122169 | 4.6% |
| 10.4 | 99890 | 3.8% |
| 5 | 98431 | 3.7% |
| 6 | 95379 | 3.6% |
| Other values (129) | 941808 | |
| (Missing) | 344548 | 13.0% |
| Value | Count | Frequency (%) |
| 0 | 164999 | 6.2% |
| 1 | 72 | < 0.1% |
| 1.2 | 319 | < 0.1% |
| 2 | 156 | < 0.1% |
| 2.3 | 640 | < 0.1% |
| Value | Count | Frequency (%) |
| 822.8 | 5 | |
| 703.1 | 2 | < 0.1% |
| 580 | 2 | < 0.1% |
| 328 | 1 | < 0.1% |
| 255 | 1 | < 0.1% |
Precipitation(in)
Real number (ℝ≥0)
| Distinct | 256 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 1529311 |
| Missing (%) | 57.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.01595538729 |
|---|---|
| Minimum | 0 |
| Maximum | 25 |
| Zeros | 937421 |
| Zeros (%) | 35.3% |
| Memory size | 20.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0.07 |
| Maximum | 25 |
| Range | 25 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.1849330967 |
|---|---|
| Coefficient of variation (CV) | 11.59063665 |
| Kurtosis | 2806.97326 |
| Mean | 0.01595538729 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 49.31946349 |
| Sum | 17910.72 |
| Variance | 0.03420025026 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 937421 | 35.3% |
| 0.01 | 51978 | 2.0% |
| 0.02 | 25730 | 1.0% |
| 0.03 | 17459 | 0.7% |
| 0.04 | 12841 | 0.5% |
| 0.05 | 10346 | 0.4% |
| 0.06 | 8267 | 0.3% |
| 0.07 | 6766 | 0.3% |
| 0.08 | 5417 | 0.2% |
| 0.09 | 4825 | 0.2% |
| Other values (246) | 41500 | 1.6% |
| (Missing) | 1529311 | 57.7% |
| Value | Count | Frequency (%) |
| 0 | 937421 | |
| 0.01 | 51978 | 2.0% |
| 0.02 | 25730 | 1.0% |
| 0.03 | 17459 | 0.7% |
| 0.04 | 12841 | 0.5% |
| Value | Count | Frequency (%) |
| 25 | 1 | |
| 10.8 | 1 | |
| 10.18 | 1 | |
| 10.16 | 1 | |
| 10.14 | 2 |
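With 57.7% of Precipitation(in) missing, it matters which denominator each statistic uses. The reported mean agrees with the sum divided by the number of non-missing rows, which suggests missing values are excluded rather than treated as zero; the sketch below checks that arithmetic and also expresses the zero count against both denominators.

```python
# Figures copied from the Precipitation(in) tables.
n_observations = 2_651_861
missing = 1_529_311
zeros = 937_421
total_sum = 17_910.72

# Rows that actually carry a precipitation reading.
non_missing = n_observations - missing

# Mean over recorded values only (missing rows excluded).
mean = total_sum / non_missing

# Zero share over all rows (as reported) and over recorded rows only.
zero_share_all = 100 * zeros / n_observations
zero_share_recorded = 100 * zeros / non_missing

print(f"non-missing rows: {non_missing}")
print(f"mean: {mean:.10f}")
print(f"zeros: {zero_share_all:.1f}% of all rows, {zero_share_recorded:.1f}% of recorded rows")
```

Among rows that have a reading at all, the large majority record no precipitation, which is consistent with the high skewness reported for this column.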
Weather_Condition
Categorical
| Distinct | 121 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 52931 |
| Missing (%) | 2.0% |
| Memory size | 20.2 MiB |
| Value | Count |
|---|---|
| Clear | 618969 |
| Fair | 416145 |
| Mostly Cloudy | 370250 |
| Overcast | 290639 |
| Partly Cloudy | 257555 |
| Other values (116) | 645372 |
Length
| Max length | 35 |
|---|---|
| Median length | 8 |
| Mean length | 8.383881059 |
| Min length | 3 |
Characters and Unicode
| Total characters | 21789120 |
|---|---|
| Distinct characters | 45 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 12 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | Light Rain |
|---|---|
| 2nd row | Light Rain |
| 3rd row | Overcast |
| 4th row | Mostly Cloudy |
| 5th row | Mostly Cloudy |
| Value | Count | Frequency (%) |
| Clear | 618969 | 23.3% |
| Fair | 416145 | 15.7% |
| Mostly Cloudy | 370250 | 14.0% |
| Overcast | 290639 | 11.0% |
| Partly Cloudy | 257555 | 9.7% |
| Cloudy | 156239 | 5.9% |
| Scattered Clouds | 155332 | 5.9% |
| Light Rain | 131040 | 4.9% |
| Light Snow | 34489 | 1.3% |
| Rain | 30462 | 1.1% |
| Other values (111) | 137810 | 5.2% |
| (Missing) | 52931 | 2.0% |
| Value | Count | Frequency (%) |
| cloudy | 791514 | 21.6% |
| clear | 618969 | 16.9% |
| fair | 420668 | 11.5% |
| mostly | 373102 | 10.2% |
| overcast | 290639 | 7.9% |
| partly | 259248 | 7.1% |
| light | 187692 | 5.1% |
| rain | 187642 | 5.1% |
| clouds | 155332 | 4.2% |
| scattered | 155332 | 4.2% |
| Other values (48) | 225406 | 6.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 2214651 | 10.2% |
| a | 1989459 | 9.1% |
| r | 1798665 | 8.3% |
| C | 1565832 | 7.2% |
| y | 1461871 | 6.7% |
| t | 1454018 | 6.7% |
| o | 1415521 | 6.5% |
| e | 1315553 | 6.0% |
| d | 1145018 | 5.3% |
| (space) | 1066614 | 4.9% |
| Other values (35) | 6361918 | 29.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 17064018 | 78.3% |
| Uppercase Letter | 3636457 | 16.7% |
| Space Separator | 1066614 | 4.9% |
| Other Punctuation | 16141 | 0.1% |
| Dash Punctuation | 5890 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| l | 2214651 | 13.0% |
| a | 1989459 | 11.7% |
| r | 1798665 | 10.5% |
| y | 1461871 | 8.6% |
| t | 1454018 | 8.5% |
| o | 1415521 | 8.3% |
| e | 1315553 | 7.7% |
| d | 1145018 | 6.7% |
| u | 966245 | 5.7% |
| i | 849441 | 5.0% |
| Other values (14) | 2453576 | 14.4% |
| Value | Count | Frequency (%) |
| C | 1565832 | 43.1% |
| F | 452954 | 12.5% |
| M | 376357 | 10.3% |
| O | 290639 | 8.0% |
| P | 262139 | 7.2% |
| S | 207848 | 5.7% |
| L | 187696 | 5.2% |
| R | 187642 | 5.2% |
| H | 45388 | 1.2% |
| T | 25013 | 0.7% |
| Other values (8) | 34949 | 1.0% |
| Value | Count | Frequency (%) |
| (space) | 1066614 | 100.0% |
| Value | Count | Frequency (%) |
| / | 16141 | 100.0% |
| Value | Count | Frequency (%) |
| - | 5890 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 20700475 | 95.0% |
| Common | 1088645 | 5.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| l | 2214651 | 10.7% |
| a | 1989459 | 9.6% |
| r | 1798665 | 8.7% |
| C | 1565832 | 7.6% |
| y | 1461871 | 7.1% |
| t | 1454018 | 7.0% |
| o | 1415521 | 6.8% |
| e | 1315553 | 6.4% |
| d | 1145018 | 5.5% |
| u | 966245 | 4.7% |
| Other values (32) | 5373642 | 26.0% |
| Value | Count | Frequency (%) |
| (space) | 1066614 | 98.0% |
| / | 16141 | 1.5% |
| - | 5890 | 0.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 21789120 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| l | 2214651 | 10.2% |
| a | 1989459 | 9.1% |
| r | 1798665 | 8.3% |
| C | 1565832 | 7.2% |
| y | 1461871 | 6.7% |
| t | 1454018 | 6.7% |
| o | 1415521 | 6.5% |
| e | 1315553 | 6.0% |
| d | 1145018 | 5.3% |
| (space) | 1066614 | 4.9% |
| Other values (35) | 6361918 | 29.2% |
Sunrise_Sunset
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 57 |
| Missing (%) | < 0.1% |
| Memory size | 20.2 MiB |
Most frequent categories: Day, Night
Length
| Max length | 5 |
|---|---|
| Median length | 3 |
| Mean length | 3.515517738 |
| Min length | 3 |
Characters and Unicode
| Total characters | 9322464 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Night |
|---|---|
| 2nd row | Night |
| 3rd row | Night |
| 4th row | Night |
| 5th row | Day |
| Value | Count | Frequency (%) |
| Day | 1968278 | 74.2% |
| Night | 683526 | 25.8% |
| (Missing) | 57 | < 0.1% |
| Value | Count | Frequency (%) |
| day | 1968278 | 74.2% |
| night | 683526 | 25.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| D | 1968278 | 21.1% |
| a | 1968278 | 21.1% |
| y | 1968278 | 21.1% |
| N | 683526 | 7.3% |
| i | 683526 | 7.3% |
| g | 683526 | 7.3% |
| h | 683526 | 7.3% |
| t | 683526 | 7.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 6670660 | 71.6% |
| Uppercase Letter | 2651804 | 28.4% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 1968278 | 29.5% |
| y | 1968278 | 29.5% |
| i | 683526 | 10.2% |
| g | 683526 | 10.2% |
| h | 683526 | 10.2% |
| t | 683526 | 10.2% |
| Value | Count | Frequency (%) |
| D | 1968278 | 74.2% |
| N | 683526 | 25.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 9322464 | 100.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| D | 1968278 | 21.1% |
| a | 1968278 | 21.1% |
| y | 1968278 | 21.1% |
| N | 683526 | 7.3% |
| i | 683526 | 7.3% |
| g | 683526 | 7.3% |
| h | 683526 | 7.3% |
| t | 683526 | 7.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9322464 | 100.0% |
Most frequent character per block
| Value | Count | Frequency (%) |
| D | 1968278 | 21.1% |
| a | 1968278 | 21.1% |
| y | 1968278 | 21.1% |
| N | 683526 | 7.3% |
| i | 683526 | 7.3% |
| g | 683526 | 7.3% |
| h | 683526 | 7.3% |
| t | 683526 | 7.3% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the steepness of the slope does not affect the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
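The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation with made-up data, not the code the report itself runs. The function name `pearson_r` and the sample values are assumptions introduced for the example.

```python
import math

def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations.

    The 1/n factors of the covariance and the standard deviations cancel,
    so plain sums of deviations are used.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up data on an exact line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

r_original = pearson_r(xs, ys)
# Shifting and (positively) rescaling either variable leaves r unchanged.
r_shifted = pearson_r([10 * x - 4 for x in xs], [0.5 * y + 100 for y in ys])
# Flipping the sign of one variable flips the sign of r.
r_negative = pearson_r(xs, [-y for y in ys])

print(r_original, r_shifted, r_negative)
```

The first two results are both 1 up to floating-point error, which illustrates the invariance under location and scale changes, while the third is -1.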
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
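A minimal sketch of this rank-based calculation follows, again in plain Python with made-up data. The helper names (`average_ranks`, `spearman_rho`) are hypothetical, and tied values are given the average of their ranks, which is a common convention assumed here rather than something the report states.

```python
import math

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Made-up data: a monotonic but nonlinear relationship (y = x^3).
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]

rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

On this data ρ reaches 1 because the relationship is perfectly monotonic, while Pearson's r stays below 1 because the relationship is not linear.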
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
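The pair-counting definition above translates directly into code. The sketch below implements exactly that formula (often called τ-a) on made-up data. The function name `kendall_tau_a` is hypothetical, and whether the report's own computation applies a tie adjustment is not stated in the source, so results on tied data may differ.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in either variable count as neither.
    return (concordant - discordant) / len(pairs)

# Made-up data: one swapped pair out of six possible pairs.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]

tau = kendall_tau_a(xs, ys)                        # 5 concordant, 1 discordant
tau_perfect = kendall_tau_a(xs, xs)                # identical ordering
tau_reversed = kendall_tau_a(xs, [4, 3, 2, 1])     # fully reversed ordering
print(tau, tau_perfect, tau_reversed)
```

The quadratic loop over all pairs is fine for illustration but would be slow on a dataset of this size; library implementations typically use faster algorithms.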
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. The Phik project provides extensive documentation.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | TMC | Severity | Start_Time | End_Time | Start_Lat | Start_Lng | Distance(mi) | Side | State | Temperature(F) | Humidity(%) | Pressure(in) | Visibility(mi) | Wind_Direction | Wind_Speed(mph) | Precipitation(in) | Weather_Condition | Sunrise_Sunset |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 201.0 | 3 | 2016-02-08 05:46:00 | 2016-02-08 11:00:00 | 39.865147 | -84.058723 | 0.01 | R | OH | 36.9 | 91.0 | 29.68 | 10.0 | Calm | NaN | 0.02 | Light Rain | Night |
| 1 | 201.0 | 2 | 2016-02-08 06:07:59 | 2016-02-08 06:37:59 | 39.928059 | -82.831184 | 0.01 | L | OH | 37.9 | 100.0 | 29.65 | 10.0 | Calm | NaN | 0.00 | Light Rain | Night |
| 2 | 201.0 | 2 | 2016-02-08 06:49:27 | 2016-02-08 07:19:27 | 39.063148 | -84.032608 | 0.01 | R | OH | 36.0 | 100.0 | 29.67 | 10.0 | SW | 3.5 | NaN | Overcast | Night |
| 3 | 201.0 | 3 | 2016-02-08 07:23:34 | 2016-02-08 07:53:34 | 39.747753 | -84.205582 | 0.01 | R | OH | 35.1 | 96.0 | 29.64 | 9.0 | SW | 4.6 | NaN | Mostly Cloudy | Night |
| 4 | 201.0 | 2 | 2016-02-08 07:39:07 | 2016-02-08 08:09:07 | 39.627781 | -84.188354 | 0.01 | R | OH | 36.0 | 89.0 | 29.65 | 6.0 | SW | 3.5 | NaN | Mostly Cloudy | Day |
| 5 | 201.0 | 3 | 2016-02-08 07:44:26 | 2016-02-08 08:14:26 | 40.100590 | -82.925194 | 0.01 | R | OH | 37.9 | 97.0 | 29.63 | 7.0 | SSW | 3.5 | 0.03 | Light Rain | Day |
| 6 | 201.0 | 2 | 2016-02-08 07:59:35 | 2016-02-08 08:29:35 | 39.758274 | -84.230507 | 0.00 | R | OH | 34.0 | 100.0 | 29.66 | 7.0 | WSW | 3.5 | NaN | Overcast | Day |
| 7 | 201.0 | 3 | 2016-02-08 07:59:58 | 2016-02-08 08:29:58 | 39.770382 | -84.194901 | 0.01 | R | OH | 34.0 | 100.0 | 29.66 | 7.0 | WSW | 3.5 | NaN | Overcast | Day |
| 8 | 201.0 | 2 | 2016-02-08 08:00:40 | 2016-02-08 08:30:40 | 39.778061 | -84.172005 | 0.00 | L | OH | 33.3 | 99.0 | 29.67 | 5.0 | SW | 1.2 | NaN | Mostly Cloudy | Day |
| 9 | 201.0 | 3 | 2016-02-08 08:10:04 | 2016-02-08 08:40:04 | 40.100590 | -82.925194 | 0.01 | R | OH | 37.4 | 100.0 | 29.62 | 3.0 | SSW | 4.6 | 0.02 | Light Rain | Day |
Last rows
| | TMC | Severity | Start_Time | End_Time | Start_Lat | Start_Lng | Distance(mi) | Side | State | Temperature(F) | Humidity(%) | Pressure(in) | Visibility(mi) | Wind_Direction | Wind_Speed(mph) | Precipitation(in) | Weather_Condition | Sunrise_Sunset |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2651851 | 201.0 | 3 | 2017-08-30 17:32:09 | 2017-08-30 18:16:19 | 33.774685 | -118.046837 | 0.0 | R | CA | 87.6 | 47.0 | 29.68 | 10.0 | South | 5.8 | NaN | Partly Cloudy | Day |
| 2651852 | 201.0 | 2 | 2017-08-30 17:31:39 | 2017-08-30 18:15:49 | 33.853939 | -117.906784 | 0.0 | R | CA | 93.0 | 34.0 | 29.66 | 10.0 | Variable | 3.5 | NaN | Clear | Day |
| 2651853 | 201.0 | 2 | 2017-08-30 17:54:40 | 2017-08-30 18:39:23 | 34.073830 | -118.233269 | 0.0 | R | CA | 88.0 | 40.0 | 29.67 | 10.0 | Variable | 4.6 | NaN | Clear | Day |
| 2651854 | 201.0 | 3 | 2017-08-30 18:04:19 | 2017-08-30 19:04:19 | 34.072350 | -117.938385 | 0.0 | R | CA | 98.6 | 27.0 | 29.69 | 10.0 | SSW | 6.9 | NaN | Partly Cloudy | Day |
| 2651855 | 201.0 | 2 | 2017-08-30 18:28:48 | 2017-08-30 18:57:54 | 34.173161 | -118.535988 | 0.0 | R | CA | 100.0 | 18.0 | 29.66 | 10.0 | WNW | 4.6 | NaN | Clear | Day |
| 2651856 | 201.0 | 3 | 2017-08-30 18:41:30 | 2017-08-30 19:11:07 | 34.495808 | -118.623932 | 0.0 | R | CA | 100.0 | 18.0 | 28.85 | 10.0 | WNW | 5.0 | 0.0 | Fair | Day |
| 2651857 | 201.0 | 3 | 2017-08-30 18:59:02 | 2017-08-30 19:27:57 | 34.031322 | -118.433723 | 0.0 | R | CA | 77.0 | 64.0 | 29.69 | 10.0 | SSW | 5.8 | NaN | Clear | Day |
| 2651858 | 201.0 | 3 | 2017-08-30 18:57:52 | 2017-08-30 19:26:11 | 34.106785 | -117.369102 | 0.0 | L | CA | 102.2 | 16.0 | 29.73 | 6.0 | SSW | 5.8 | NaN | Haze | Day |
| 2651859 | 201.0 | 3 | 2017-08-30 19:49:01 | 2017-08-30 20:18:00 | 33.924686 | -118.103981 | 0.0 | R | CA | 88.0 | 39.0 | 29.68 | 10.0 | West | 3.5 | NaN | Clear | Night |
| 2651860 | 201.0 | 2 | 2017-08-30 20:17:21 | 2017-08-30 20:47:21 | 33.729469 | -117.397354 | 0.0 | R | CA | 89.6 | 40.0 | 29.78 | 10.0 | South | 3.5 | NaN | Clear | Night |
Most frequent
| | TMC | Severity | Start_Time | End_Time | Start_Lat | Start_Lng | Distance(mi) | Side | State | Temperature(F) | Humidity(%) | Pressure(in) | Visibility(mi) | Wind_Direction | Wind_Speed(mph) | Precipitation(in) | Weather_Condition | Sunrise_Sunset | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 533 | 201.0 | 3 | 2018-09-16 13:24:13 | 2018-09-16 13:51:54 | 33.978249 | -81.195084 | 0.0 | R | SC | 75.0 | 100.0 | 29.74 | 1.0 | SE | 13.8 | 0.01 | Rain | Day | 11 |
| 849 | 245.0 | 3 | 2020-11-12 06:28:16 | 2020-11-12 07:12:11 | 25.942879 | -80.187675 | 0.0 | R | FL | 78.0 | 100.0 | 29.91 | 10.0 | S | 5.0 | 0.00 | Partly Cloudy | Night | 11 |
| 843 | 245.0 | 3 | 2020-03-12 22:33:35 | 2020-03-12 23:02:42 | 33.585026 | -84.513145 | 0.0 | R | GA | 69.0 | 65.0 | 28.87 | 10.0 | SSW | 9.0 | 0.00 | Partly Cloudy | Night | 9 |
| 532 | 201.0 | 3 | 2018-09-16 13:24:12 | 2018-09-16 13:51:54 | 33.978249 | -81.195084 | 0.0 | R | SC | 75.0 | 100.0 | 29.74 | 1.0 | SE | 13.8 | 0.01 | Rain | Day | 8 |
| 801 | 241.0 | 3 | 2019-09-16 15:09:55 | 2019-09-16 15:54:02 | 32.907650 | -96.897278 | 0.0 | R | TX | 95.0 | 34.0 | 29.46 | 10.0 | E | 9.0 | 0.00 | Partly Cloudy | Day | 7 |
| 374 | 201.0 | 2 | 2020-08-14 07:43:26 | 2020-08-14 08:13:02 | 47.673512 | -117.467613 | 0.0 | R | WA | 56.0 | 45.0 | 27.67 | 10.0 | SE | 8.0 | 0.00 | Fair | Day | 4 |
| 429 | 201.0 | 2 | 2020-10-27 06:54:07 | 2020-10-27 07:53:21 | 38.713409 | -90.284042 | 0.0 | R | MO | 37.0 | 86.0 | 29.65 | 6.0 | NNE | 7.0 | 0.00 | Light Rain | Night | 4 |
| 658 | 201.0 | 3 | 2020-04-10 19:53:37 | 2020-04-10 20:21:52 | 38.843868 | -94.529579 | 0.0 | R | MO | 52.0 | 37.0 | 28.75 | 10.0 | SSE | 9.0 | 0.00 | Cloudy | Night | 4 |
| 845 | 245.0 | 3 | 2020-06-09 09:38:20 | 2020-06-09 10:07:39 | 29.768175 | -95.265442 | 0.0 | R | TX | 87.0 | 69.0 | 29.73 | 10.0 | SSW | 9.0 | 0.00 | Fair | Day | 4 |
| 166 | 201.0 | 2 | 2019-07-16 08:41:16 | 2019-07-16 09:40:58 | 33.414162 | -82.014641 | 0.0 | L | GA | 82.0 | 76.0 | 29.97 | 5.0 | CALM | 0.0 | 0.00 | Fair | Day | 3 |
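The count column in the table above records how often a fully identical row occurs. A minimal sketch of such a tally is shown below using only the standard library. The rows are shortened, made-up tuples rather than real records, and the variable names are illustrative. How the report itself derives its duplicate statistics is not stated in the source.

```python
from collections import Counter

# Made-up rows, shortened to three fields each for readability.
rows = [
    ("201.0", "SC", "Rain"),
    ("245.0", "FL", "Partly Cloudy"),
    ("201.0", "SC", "Rain"),
    ("201.0", "OH", "Overcast"),
    ("201.0", "SC", "Rain"),
    ("245.0", "FL", "Partly Cloudy"),
]

# Count how many times each complete row appears.
counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicated = sorted(
    ((row, n) for row, n in counts.items() if n > 1),
    key=lambda item: item[1],
    reverse=True,
)

for row, n in duplicated:
    print(n, row)
```

Rows must be hashable (tuples here) for `Counter` to group them, and rows that appear only once are dropped from the result.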